Evaluation of Web Retrieval Methods Using Anchor Text

نویسندگان

  • Kenji Tateishi
  • Hideki Kawai
  • Susumu Akamine
  • Katsushi Matsuda
  • Toshikazu Fukushima
چکیده

In this paper, we evaluate two types of anchor texts: a page anchor and a site anchor. Since the anchor text tends to summarize information referred ahead, it can be expected that the terms appearing there have important meaning in information retrieval. We introduce a retrieval method to give high priority to the terms in the anchor text. In the experiment, we compared the proposed method with the base line which indexed only page documents. The result indicated that both methods had almost the same accuracy, and that there were many queries in which the accuracy much differed between two methods. It can be expected that the improvement on the queries in which the proposed method was inferior to base line will be achieved by deleting overlapped anchor texts toward the same page.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Exploiting Anchor Text for the Navigational Web Retrieval at NTCIR-5

In the Navigational Retrieval Subtask 2 (Navi-2) at the NTCIR-5 WEB Task, a hypothetical user knows a specific item (e.g., a product, company, and person) and requires to find one or more representative Web pages related to the item. This paper describes our system participated in the Navi-2 subtask and reports the evaluation results of our system. Our system uses three types of information obt...

متن کامل

Verification of Effective Retrieval Method for Anchor Text on Navigational Retrieval

We participated in NTCIR-5 WEB Navigational Retrieval Subtask(Navi-2) in order to verify the most effective retrieval method for the index of anchor texts by using a retrieval system that indexed only anchor texts instead of full texts of Web pages. We introduced retrieval methods that combine one or more of six retrieval measures: (a) anchor frequency (af), (b) reference consistency (rc), (c) ...

متن کامل

Using Document Structure on Retrieving Webpages at the Web-CLEF 2006

We present a report on our participation in the mixed monolingual web task of the 2006 Cross-Language Evaluation Forum (CLEF). We compared the result of web page retrieval based on the page content, page title, and anchor page. The retrieval effectiveness for the combination of page content, page title, and anchor texts was better than that of the combination of page title and page title only. ...

متن کامل

Application of Localized Similarity for Web Documents

In this paper we present a novel approach to automatic creation of anchor texts for hyperlinks in a document pointing to similar documents. Methods used in this approach rank parts of a document based on the similarity to a presumably related document. Ranks are then used to automatically construct the best anchor text for a link inside original document to the compared document. A number of di...

متن کامل

A Transitive Model for Extracting Translation Equivalents of Web Queries through Anchor Text Mining

One of the existing difficulties of cross-language information retrieval (CLIR) and Web search is the lack of appropriate translations of new terminology and proper names. Different from conventional approaches, in our previous research we developed an approach for exploiting Web anchor texts as live bilingual corpora and reducing the existing difficulties of query term translation. Although We...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002